ABSTRACT

Using symmetric boundary conditions at separated times, I show analyti- 
cally that both the time ordering of (macroscopic) causality and the direction of 
entropy increase follow from these boundary conditions. In particular, when the 
endpoints have low entropy, these arrows of time point away from the ends and 
toward the middle. Causality in this context means that when perturbations are 
applied, the effect of the perturbation — the macroscopic change in the system's 
behavior — is confined to one temporal side of the perturbations. These results 
hold for both mixing and integrable systems, although relaxation for integrable 
systems is incomplete. Simulations are presented for purposes of illustration. 



1. Introduction 

By "causality" I mean that if a system is perturbed the macroscopic effect 
occurs subsequent to the perturbation. There is a lot of baggage in this definition. 
First, I am not talking about the microscopic causality of relativistic quantum field 
theory, which is a statement about the vanishing of commutators (or anticommu- 
tators) at spacelike separations. Second, I am trying to avoid the many and subtle 
definitions that have appeared in the philosophical literature, some of which are 
close to mine, some of which are not. Then there is the word "perturbation," which 
suggests a kind of control or free will. Finally, there is the term "macroscopic," 
equivalent to a notion of coarse grains, yet another nontrivial concept. 

Defining causality in terms of sequential order emphasizes its relation to the 
thermodynamic arrow of time. Indeed, some consider causality (with similar mean- 
ing and baggage) to be the primary concept [1, pp. 163-164], with other kinds of 
ordering (in particular, the second law of thermodynamics) consequences of it. 

I will take neither of these concepts to be primary, and will instead derive both 
from a model, or caricature, of the expansion of the universe. This follows the 
ideas of Gold [2] and my own elaboration of them [3,4], in particular emphasizing 
the notion of two-time boundary conditions. It is clear that it would be pointless 



to study causality as defined above using initial values for macroscopic problems,
since such a formulation forces the effect of a perturbation to be subsequent. So 
in studying causality, as in studying the arrow of time, one should formulate the 
problem time-symmetrically if one's conclusions are to be noncircular. 

I will find that both macroscopic causality and the second law, meaning entropy 
increase, can be derived in the appropriate two-time boundary condition context. 
For sufficiently chaotic dynamical systems both features flow naturally from the 
formalism. For integrable systems, relaxation can be imposed by averaging over 
frequencies. But with future conditioning additional time scales enter, and while 
one can still get relaxation and causality, there is not the same simplicity as for 
chaotic systems. 

In Sec. 2 I introduce the general context for this discussion as well as notation. 
In the following section there is an analytic derivation of symmetric entropy increase 
for systems having appropriate two-time boundary conditions. Causality, treated in 
Sec. 4, is established using the same methods. In the last section numerical work is 
shown to illustrate the results of the previous sections. There are two appendices. 
In the first I give a general derivation of entropy increase when coarse graining is 
implemented at each time step. This is a master equation approach and is mainly 
pedagogical. In the second appendix I indicate how the notion of "perturbation" 
need not depend on philosophical questions concerning free will. 

2. Framework and notation 

As in previous publications [3-6], my context is a two time boundary value 
problem in which macroscopic data are specified at an early time, "0," and a late 
time, "T." At both boundary times the system is in a restricted state (i.e., low 
entropy). For the systems previously studied the entropy increases, with varying 
degrees of monotonicity, away from both boundary times. Moreover, for chaotic 
systems, if the interval between the boundary times exceeds twice the system's 
relaxation time, the initial relaxation is macroscopically indistinguishable from 
unconditioned time evolution. Furthermore, the evolution away from the final point
(i.e., from T to smaller values of the parameter t) is the symmetric image of the 
initial relaxation — this assertion is true even with conditions on the time evolution 
that are weaker than time reversal invariance. All these features have been used to 
argue for Gold's thesis. In some of the references above I have elaborated on my 
rationale for taking this approach, and will not repeat the argument here. Most of 
my previous demonstrations have used the cat map [7] as the dynamical system, 
and computer simulations to provide the evidence. 

In this article I will argue more generally, extending both the systems studied 
and the method of justification. In effect this explains why the simulations work, 
although a discussion without explicit equations occurs in [4] and embodies the 
essential ideas to be presented below. 
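
Consider classical mechanics on a (phase) space $\Omega$. Let $\mu$ be the measure on
$\Omega$ and $\mu(\Omega) = 1$. Let the dynamics be given by a measure-preserving map
$\phi^{(t)}$ on $\Omega$, with $\phi^{(t)}(\omega)$ the time-$t$ image of an initial point
$\omega \in \Omega$. The time parameter, $t$, may be either continuous or discrete.

The notion of "macroscopic" is provided by a coarse graining on $\Omega$. This is a
finite set of sets of strictly positive measure that cover $\Omega$: $\{\Delta_\alpha\}$,
$\alpha = 1, \ldots, G$, with $\cup_\alpha \Delta_\alpha = \Omega$ and
$\Delta_\alpha \cap \Delta_\beta = \emptyset$ for $\alpha \neq \beta$. Let $\chi_\alpha$ be the
characteristic function of $\Delta_\alpha$ and let $v_\alpha = \mu(\Delta_\alpha)$. If $f$ is a
function on $\Omega$, its coarse graining is defined to be

$$\hat f(\omega) \equiv \sum_\alpha \frac{\chi_\alpha(\omega)}{v_\alpha} f_\alpha = \sum_\alpha \chi_\alpha(\omega) \langle f \rangle_\alpha , \quad \text{with } f_\alpha = \int_\Omega \chi_\alpha(\omega) f(\omega)\, d\mu , \quad \langle f \rangle_\alpha = \frac{f_\alpha}{v_\alpha} . \tag{1}$$

Thus $\langle f \rangle_\alpha$ is the average of $f$ on $\Delta_\alpha$ and $\int_\Omega \hat f = \int_\Omega f$.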

Let the system's distribution in $\Omega$ be described by a density function $\rho(\omega)$. One
can think of this distribution in more than one way. In terms of the ideal gas of
cat map atoms that I have used before, $\rho$ can be thought of as the density of atoms
on the phase space, $I^2$, on which a single cat map lives. (In this case $\rho$ is a sum
of $\delta$-functions.) More generally, one can think of $\Omega$ as the $6N$-dimensional phase
space of $N$ particles in three space dimensions. In this way there is no restriction
in allowing interactions among the particles.

If one takes a primitive notion of entropy as

$$S_{\rm prim} = -\int_\Omega \rho(\omega) \log(\rho(\omega))\, d\mu ,$$

then $S_{\rm prim}$ is constant in time, trivially by virtue of the measure preserving property
of $\phi^{(t)}$ (and its invertibility). The entropy that I will use for studying irreversibility
involves coarse graining and is defined as

$$S(\rho) \equiv S_{\rm prim}(\hat\rho) = -\int_\Omega \hat\rho \log \hat\rho \, d\mu . \tag{2}$$

It is easy to show that

$$S(\rho) = S(\rho_\alpha | v_\alpha) ,$$

where, as in Eq. (1), $\rho_\alpha = \int_{\Delta_\alpha} \rho \, d\mu$, and the function $S(p|q)$ is the relative entropy
defined by

$$S(p|q) = -\sum_x p(x) \log\left(\frac{p(x)}{q(x)}\right) ,$$

with $p$ and $q$ probability distributions such that $q(x)$ vanishes only if $p(x)$ does.
Note that $\sum_\alpha \rho_\alpha = \int_\Omega \rho \, d\mu = 1$, and that all $v_\alpha$ are nonzero [8].
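
The coarse-grained entropy is simple to evaluate once the grain probabilities are known. The following is a minimal sketch, assuming $G$ equal grains so that $v_\alpha = 1/G$; the function name and the value $G = 25$ are illustrative choices. It shows the two extreme cases used later in the text: everything in one grain gives $-\log G$, and a uniform distribution gives 0.

```python
import math

def coarse_grained_entropy(rho, v):
    """Return S(rho|v) = -sum_a rho_a * log(rho_a / v_a).

    rho: grain probabilities rho_a (nonnegative, summing to 1)
    v:   grain volumes v_a (strictly positive)
    Terms with rho_a == 0 contribute zero (the 0 * log 0 = 0 convention).
    """
    return -sum(r * math.log(r / w) for r, w in zip(rho, v) if r > 0.0)

G = 25                                   # hypothetical number of equal grains
v = [1.0 / G] * G
concentrated = [1.0] + [0.0] * (G - 1)   # all probability in a single grain
uniform = [1.0 / G] * G                  # fully relaxed distribution

s_low = coarse_grained_entropy(concentrated, v)   # equals -log G
s_max = coarse_grained_entropy(uniform, v)        # equals 0
```

With this sign convention the entropy is never positive, and its maximum, 0, is reached only when $\rho_\alpha = v_\alpha$ for every grain.
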

3. Time-dependence of the entropy, with and without future conditioning

The system is required to start ($t = 0$) in a subset $\epsilon_0 \subset \Omega$ and end ($t = T$) in
a subset $\epsilon_T \subset \Omega$. The points of $\Omega$ satisfying this two-time boundary condition are

$$\epsilon = \epsilon_0 \cap \phi^{(-T)}(\epsilon_T) . \tag{3}$$

The set $\epsilon$ can be empty. I have argued though [9,4] that for chaotic dynamics and
for sufficiently long times $T$ there exist solutions, i.e., $\epsilon \neq \emptyset$. Moreover, for such
times

$$\mu(\epsilon) \approx \mu(\epsilon_0)\,\mu(\epsilon_T) . \tag{4}$$

To see how this comes about, consider mixing dynamics. The map is mixing if

$$\lim_{t \to \infty} \mu\left(A \cap \phi^{(t)}(B)\right) = \mu(A)\,\mu(B) \tag{5}$$

for measurable subsets $A$ and $B$ of $\Omega$. For such systems Eq. (4) will be satisfied
in the $t \to \infty$ limit. This limit says nothing about rates of convergence, but I will
assume that there is some time $\tau$ such that the decorrelation condition (Eq. (5))
holds to good accuracy for $t > \tau$ [10]. The set $\epsilon$ will therefore be nonempty for
$T > \tau$. Under $\phi^{(t)}$, $\epsilon$ becomes

$$\epsilon(t) = \phi^{(t)}(\epsilon_0) \cap \phi^{(t-T)}(\epsilon_T) .$$
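
The approximate factorization of Eq. (4) can be checked by direct sampling. The following is a minimal sketch, assuming the cat map as the mixing dynamics and taking both boundary sets to be the square $[0, 0.1) \times [0, 0.1)$; the function names, the sample size, the seed and the choice $T = 10$ are illustrative assumptions. It estimates the fraction of points of the initial set that land in the final set after $T$ steps, which should be close to the measure of the final set, 0.01.

```python
import random

def cat_map(x, y):
    """One step of the cat map on the unit square: x' = x + y, y' = x + 2y (mod 1)."""
    return (x + y) % 1.0, (x + 2.0 * y) % 1.0

def fraction_reaching(T, n_samples=50000, side=0.1, seed=1):
    """Estimate mu(eps_0 intersect phi^(-T)(eps_T)) / mu(eps_0) by sampling eps_0.

    Both eps_0 and eps_T are taken to be the square [0, side) x [0, side).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_samples):
        x, y = rng.random() * side, rng.random() * side   # uniform point in eps_0
        for _ in range(T):
            x, y = cat_map(x, y)
        if x < side and y < side:                          # landed in eps_T
            hits += 1
    return hits / n_samples

frac = fraction_reaching(T=10)
mu_eps_T = 0.1 * 0.1   # measure of the final set
```

Multiplying `frac` by the measure of the initial set gives the estimate of $\mu(\epsilon)$; agreement with `mu_eps_T` up to sampling error is what Eq. (4) asserts.
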

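To calculate the entropy, the density, which was $\rho(0) = \chi_\epsilon / \mu(\epsilon)$ at time-0, must be
coarse grained. The important quantity for the entropy calculation is

$$\rho_\alpha(t) = \frac{\mu\left(\Delta_\alpha \cap \epsilon(t)\right)}{\mu(\epsilon)} = \frac{\mu\left(\Delta_\alpha \cap \phi^{(t)}(\epsilon_0) \cap \phi^{(t-T)}(\epsilon_T)\right)}{\mu(\epsilon)} .$$

If $T - t > \tau$ then the following will hold

$$\mu\left(\Delta_\alpha \cap \phi^{(t)}(\epsilon_0) \cap \phi^{(t-T)}(\epsilon_T)\right) = \mu\left(\Delta_\alpha \cap \phi^{(t)}(\epsilon_0)\right) \mu\left(\phi^{(t-T)}(\epsilon_T)\right) ,$$

$$\mu(\epsilon) = \mu(\epsilon_0)\, \mu\left(\phi^{(-T)}(\epsilon_T)\right) .$$

Using the measure-preserving property of $\phi^{(t)}$, the factors $\mu(\epsilon_T)$ in both numerator
and denominator of $\rho_\alpha$ cancel, leading to

$$\rho_\alpha(t) = \mu\left(\Delta_\alpha \cap \phi^{(t)}(\epsilon_0)\right) / \mu(\epsilon_0) .$$

This is precisely what one gets without future conditioning, so that all macroscopic
quantities, and in particular the entropy, are indistinguishable from their unconditioned
values.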

Working backward from time-$T$ one obtains an analogous result. Define a
variable $s \equiv T - t$ and set $\tilde\epsilon(s) \equiv \epsilon(T - s)$. Then

$$\tilde\epsilon(s) = \phi^{(T-s)}(\epsilon_0) \cap \phi^{(-s)}(\epsilon_T) .$$

If $s$ satisfies $T - s > \tau$, then when the density associated with $\tilde\epsilon(s)$ is calculated, its
dependence on $\epsilon_0$ will drop out. It follows that

$$\rho_\alpha(s) = \mu\left(\Delta_\alpha \cap \phi^{(-s)}(\epsilon_T)\right) / \mu(\epsilon_T) .$$

For a time-reversal invariant dynamics this will give the entropy the same time de- 
pendence coming back from T as going forward from 0. It is interesting that the 
cat map is not strictly time-reversal invariant (by definitions of the form given in 
[11]) but, as I have shown repeatedly, its entropy as a function of time is symmetric. 
The reason is that the Lyapunov exponent is the same for the map and its inverse. 
For the cat map, there isn't much choice: the 2x2 matrix has only two eigenvalues 
and their product is unity. But I expect the similarity of macroscopic dynamics 
in both directions to obtain even for richer systems. Thus, comparing true phys- 
ical dynamics with its time-reversed counterpart, ordinary macroscopic relaxation 
should be the same, yielding symmetric entropy dependence. I justify this expecta- 
tion by the absence (so far) of any time-reversal or CP violating observations at the 
atomic level, as well as the assumption that ordinary physical relaxation processes, 
accounting for the thermodynamic arrow of our experience, occur at grosser levels 
than those at which CP violation has been detected. 
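
The statement about the cat map's Lyapunov exponent can be verified with a few lines of arithmetic. The following is a minimal sketch; the helper name is an illustrative choice, and the matrices are the cat map matrix (rule $x' = x + y$, $y' = x + 2y$) and its inverse. It shows that the two eigenvalues multiply to one and that the map and its inverse share the same expanding eigenvalue.

```python
import math

def eigenvalues_2x2(a, b, c, d):
    """Eigenvalues of [[a, b], [c, d]], assuming they are real."""
    tr, det = a + d, a * d - b * c
    disc = math.sqrt(tr * tr - 4.0 * det)
    return (tr + disc) / 2.0, (tr - disc) / 2.0

# Cat map matrix [[1, 1], [1, 2]] and its inverse [[2, -1], [-1, 1]].
lam_fwd = eigenvalues_2x2(1, 1, 1, 2)
lam_inv = eigenvalues_2x2(2, -1, -1, 1)

# Largest Lyapunov exponent: log of the expanding eigenvalue.
lyap_fwd = math.log(max(lam_fwd))
lyap_inv = math.log(max(lam_inv))
```

Both exponents equal $\log((3 + \sqrt{5})/2)$, so phase-space stretching proceeds at the same rate in either time direction.
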

It is worth putting into words the essence of the mathematical argument just
given. The set $\epsilon$ is a subset of $\epsilon_0$; which points of $\epsilon_0$ are also in $\epsilon$ is determined by the
choppy characteristic function of the set $\phi^{(-T)}(\epsilon_T)$. For long enough times, $T$, the
good points of $\epsilon$ are Poisson distributed within $\epsilon_0$ [4]. Thus following $\epsilon$ forward in
time (with $\phi^{(t)}$) is like following a random subset of $\epsilon_0$. But such time evolution is one
way of studying $\epsilon_0$ itself. If you wanted to do a Monte Carlo study for the evolution
of $\epsilon_0$, your technique would be to follow the time dependence of a random subset.
The pseudo-randomness imposed by the characteristic function of $\phi^{(-T)}(\epsilon_T)$ is not
worse than other kinds of pseudo-randomness.

The same pseudo-randomness holds for $\epsilon(t)$ ($t > 0$), provided the time to the
final point, $T - t$, is greater than $\tau$. I have used the mixing property to argue
for randomness, but I expect weaker conditions of ergodicity to be sufficient in
physically relevant situations.

Integrable systems (relaxation) 

Without mixing or some kind of ergodicity the foregoing arguments fail. How- 
ever, harmonic oscillators can be quite useful in studies of relaxation [12], although 
in previous two-time boundary value studies [3] deficiencies were noted. The gen- 
eral idea is that although an individual oscillator does not spread in phase space, if 
enough different frequencies are taken there is relaxation. 

Rather than work with sets, as above, I consider an "ideal gas" of $N$ oscillators,
with oscillator #$k$ having position $x^{(k)}$ and frequency $\nu^{(k)}$. (For convenience I take
the period of these oscillators to be 1 (rather than $2\pi$) and use frequency rather
than angular frequency.) Time evolution is given by $x^{(k)}(t) = x_0^{(k)} + \nu^{(k)} t$ (mod 1).
The boundary conditions are

$$0 < x_0^{(k)} < \delta x \quad \& \quad 0 < x^{(k)}(T) < \delta x , \quad \text{with } x^{(k)}(T) = x_0^{(k)} + \nu^{(k)} T \ (\text{mod } 1) ,$$
$$\nu_0 < \nu^{(k)} < \nu_0 + \delta\nu \quad (\nu \text{ does not change in time}) . \tag{6}$$

Both $x_0^{(k)}$ and $\nu^{(k)}$ can be randomly selected consistent with these conditions.
From the final-time condition on $x$ it follows that for sufficiently large $T$ there
is a nonempty finite set of integers $\{n_\ell\}$ so that

$$\frac{n_\ell - x_0^{(k)}}{T} < \nu^{(k)} < \frac{n_\ell + \delta x - x_0^{(k)}}{T} , \tag{7}$$

with $\nu^{(k)}$ within the permitted range. One can plot the set of allowed initial points
in the $x$-$\nu$ plane. Within the rectangle $[0, \delta x] \times [\nu_0, \nu_0 + \delta\nu]$ the solution points fall in
a sequence of parallel parallelograms. Each is bounded by vertical lines at $x = 0, \delta x$
and by lines of slope $-1/T$ defining the upper and lower values for each $n_\ell$ (except
that parallelograms going outside $[\nu_0, \nu_0 + \delta\nu]$ are cut off). For large $T$ there will
be many such parallelograms; for small $T$, few or none.

A natural coarse graining is to divide the $x$-range, $[0, 1]$, into $G$ intervals and
look only at values of $x$ to compute entropy, since $\nu$ does not change. For present
purposes the boundary value quantity, $\delta x$, will be taken smaller than $1/G$.

I first examine the equilibration of this system without future conditions. Initially
all points are in $[0, \delta x]$; they separate from one another only by virtue of
possessing different frequencies. Equilibration is marked by $\delta\nu\, t \sim 1$, leading to the
definition of a relaxation time $\tau_{\rm osc} \equiv 1/\delta\nu$. It follows that on a time scale of $\tau_{\rm osc}$ the
entropy will rise from $-\log G$ to 0 (with the stated condition, $\delta x < 1/G$).
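
The unconditioned relaxation of the oscillator gas is easy to reproduce numerically. The following is a minimal sketch, assuming $G = 25$ grains and $\delta\nu = 0.1$ (the values quoted in Sec. 5), together with hypothetical choices $\delta x = 0.03$ and $\nu_0 = 1$; the function names, sample size and seed are also illustrative. It shows the entropy starting at $-\log G$ and approaching 0 once $t$ is a few times $\tau_{\rm osc} = 1/\delta\nu = 10$.

```python
import math
import random

def entropy_of_positions(xs, G):
    """S = -sum_a rho_a log(rho_a / v_a) for G equal grains on [0, 1)."""
    counts = [0] * G
    for x in xs:
        counts[min(int(x * G), G - 1)] += 1
    n = len(xs)
    return -sum((c / n) * math.log((c / n) * G) for c in counts if c > 0)

def unconditioned_entropy(t, G=25, dx=0.03, nu0=1.0, dnu=0.1, n=20000, seed=2):
    """Entropy at time t for oscillators constrained only at the initial time."""
    rng = random.Random(seed)
    xs = []
    for _ in range(n):
        x0 = rng.random() * dx           # initial position in [0, dx)
        nu = nu0 + rng.random() * dnu    # frequency in [nu0, nu0 + dnu)
        xs.append((x0 + nu * t) % 1.0)
    return entropy_of_positions(xs, G)

s_start = unconditioned_entropy(0)    # all points in one grain: -log G
s_late = unconditioned_entropy(40)    # four times tau_osc: close to 0
```
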

Now consider the situation with the future conditioning of Eq. (6). For each $n_\ell$
of Eq. (7), the points of its parallelogram either remain in a single grain ($\forall t < T$)
or at worst overlap two. This does not contribute significantly to entropy increase.
Rather, the separation of individual parallelograms is required for equilibration. The
number of such parallelograms is estimated by replacing $\nu$ by $n/T$ in the second part
of Eq. (6). It follows that the number of $n$ values is $T\delta\nu$. Neglecting grain overlap,
this implies that the maximum entropy for the two-time boundary value problem
is $\log(T\delta\nu) - \log G$. Therefore the system cannot fully relax unless $T > G/\delta\nu$. This
is much longer than the unconditioned relaxation time, $\tau_{\rm osc} = 1/\delta\nu$, and is in sharp
contrast to the behavior of mixing systems: for the cat map the "$T$" necessary for
normal initial relaxation is only twice the usual relaxation time.
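
The estimate of the entropy ceiling can be evaluated for the parameters used in the numerical section. The following is a minimal sketch, assuming the values $G = 25$ and $\delta\nu = 1/10$ quoted in Sec. 5; the function names are illustrative, and the cap at 0 is added here because the entropy cannot exceed its equilibrium value.

```python
import math

def max_conditioned_entropy(T, G, dnu):
    """Estimate log(T * dnu) - log(G), capped at the equilibrium value 0."""
    n_values = T * dnu            # approximate number of allowed integers n
    return min(0.0, math.log(n_values) - math.log(G))

def full_relaxation_time(G, dnu):
    """Conditioning interval T at which the estimated ceiling reaches 0."""
    return G / dnu

s_T50 = max_conditioned_entropy(T=50, G=25, dnu=0.1)     # log 5 - log 25 = -log 5
s_T250 = max_conditioned_entropy(T=250, G=25, dnu=0.1)   # ceiling reaches 0
T_full = full_relaxation_time(G=25, dnu=0.1)             # 250 time steps
```

For a conditioning interval of 50 steps the ceiling is about $-1.6$, well below equilibrium, even though the unconditioned relaxation time is only 10 steps.
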

There are further defects in the relaxation. Consider what happens when $t = T/2$.
Write $\nu = (n + \beta)/T$, with $\beta$ a number on the order of $\delta x$ (cf. Eq. (7)). Then
$x^{(k)}(T/2) = x_0 + n/2 + \beta/2$. For even $n$ this means that many of the points
are back in the original interval, or close to it. Thus the entropy will drop. The
same happens, but less dramatically, for other divisors. Finally, there is an inherent
weakness in any oscillator equilibrium, in that only half the dynamical variables
relax at all: the frequencies ($\nu$) do not change.
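
The half-time entropy drop can be seen in a small simulation. The following is a minimal sketch, assuming $T = 200$, $G = 25$ and $\delta\nu = 0.1$ (values taken from Sec. 5) and hypothetical choices $\delta x = 0.03$, $\nu_0 = 1$ and a sample of 500 oscillators obtained by rejection sampling; all function names are illustrative. It compares the entropy at $t = T/2$ with the entropy at a generic intermediate time, here $t = 37$.

```python
import math
import random

def grain_entropy(xs, G):
    """S = -sum_a rho_a log(rho_a / v_a) for G equal grains on [0, 1)."""
    counts = [0] * G
    for x in xs:
        counts[min(int(x * G), G - 1)] += 1
    n = len(xs)
    return -sum((c / n) * math.log((c / n) * G) for c in counts if c > 0)

def conditioned_oscillators(T, dx, nu0, dnu, n_keep, seed=3):
    """Rejection-sample (x0, nu) pairs that satisfy both boundary conditions."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < n_keep:
        x0 = rng.random() * dx
        nu = nu0 + rng.random() * dnu
        if (x0 + nu * T) % 1.0 < dx:      # final-time condition on x
            kept.append((x0, nu))
    return kept

def entropy_at(t, sample, G):
    return grain_entropy([(x0 + nu * t) % 1.0 for x0, nu in sample], G)

T, G = 200, 25
sample = conditioned_oscillators(T, dx=0.03, nu0=1.0, dnu=0.1, n_keep=500)
s_generic = entropy_at(37, sample, G)    # generic intermediate time
s_half = entropy_at(T // 2, sample, G)   # half-time: even-n points return
```

At $t = T/2$ the oscillators collapse onto two clusters (even and odd $n$), so `s_half` lies far below `s_generic`.
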

In terms of the big bang-big crunch cosmological model considered in [4-6] 
these defects are probably irrelevant. It appears that the "oscillators" in our cos- 
mos are mostly the degrees of freedom of the electromagnetic field. These reach 
equilibrium through being coupled to massive matter, which presumably does re- 
lax appropriately. If they do satisfy a two-time boundary condition with, say, the 
boundary times at the decoupling epoch and its pre-big crunch partner, then I would 
not expect the timing to be so precise and coincident that one would get photon 
entropy-lowering at the cosmological midpoint (as in our "even n" condition above).
Moreover, photons do not equilibrate very well: witness the preservation of indi- 
cations of spatial structure as deduced from cosmic background radiation. On the 
other hand, the spectrum of this radiation corresponds very well to equilibrium, the 
reason being interaction with matter prior to decoupling. 

4. Causality and perturbations

The notion of macroscopic causality used here involves a perturbation. One 
imposes two-time boundary conditions and considers dynamical evolution with both 
unperturbed and perturbed dynamics. When solving the same boundary value 
problem, these rules will select different microscopic solutions. Although I will 
consider perturbations occurring only at a single moment in time, the microscopic 
solutions will (in general) differ everywhere. But it is the macroscopic solutions 
that allow a notion of causality. In principle macroscopic behavior could also differ 
at all times (except for the boundaries), but in a system with causality they will 
differ on only one side of the perturbation. For the usual causality, they will differ 
only after the perturbation. But we will also find that they can differ only before, 
where in this sentence and the last the words "before" and "after" are defined with 
respect to a microscopic time parameter, that, as will be seen, may differ from the 
natural thermodynamic time. 

There is a delicate point here that is discussed in Appendix B. The term 
"perturbation" suggests free will, while two-time boundary conditions sound like 
the opposite. Resolving issues of free will is not my objective, and the appendix is 
devoted to formulating the concept of perturbation in a purely physical context. 

Although I will later give examples (figures) in terms of discrete time, for 
formal purposes it is easiest to work in continuous time and to imagine that the 
perturbation is instantaneous. The time interval for the boundary value problem 
is [0,T]. Call the unperturbed system A; its history, time evolution, dynamics and 
boundary conditions are exactly as described in the previous section. That is, it 
evolves under $\phi^{(t)}$, its boundary conditions are $\epsilon_0$ and $\epsilon_T$, and its microstates are
in the set

$$\epsilon_A = \epsilon_0 \cap \phi^{(-T)}(\epsilon_T) \tag{8}$$

(formerly called $\epsilon$). System B, the perturbed case, has an additional transformation
act on it at time-$t_0$. Call this transformation $\psi$. It should not be dissipative; I do
not want the arrow to arise from such an asymmetry alone [13]. $\psi$ is thus invertible
and measure preserving. Successful solutions must go from $\epsilon_0$ to $\epsilon_T$ under the
transformation $\phi^{(T-t_0)} \psi\, \phi^{(t_0)}$. The microstates for system B are therefore in

$$\epsilon_B = \epsilon_0 \cap \phi^{(-t_0)} \psi^{-1} \phi^{(-T+t_0)}(\epsilon_T) . \tag{9}$$



Clearly, $\epsilon_A$ and $\epsilon_B$ are different. But as I shall now show, for mixing dynamics
and for sufficiently large $T$, the following hold: 1) for $t_0$ close to 0, the only
differences in macroscopic behavior between A and B are for $t > t_0$; 2) for $t_0$ close to
$T$, the only differences in macroscopic behavior between A and B are for $t < t_0$.
This means (recalling Sec. 3) that the direction of causality follows the direction of
entropy increase.

The proof is nearly the same as that of the previous section. Again we use the
time $\tau$ such that the mixing decorrelation holds for time intervals longer than $\tau$.
First consider $t_0$ close to 0. The observable macroscopic quantities are the densities
in the grains $\Delta_\alpha$, which are, for $t < t_0$,

$$\rho^A_\alpha(t) = \mu\left(\Delta_\alpha \cap \phi^{(t)}(\epsilon_0) \cap \phi^{(t-T)}(\epsilon_T)\right) / \mu(\epsilon_A) ,$$

$$\rho^B_\alpha(t) = \mu\left(\Delta_\alpha \cap \phi^{(t)}(\epsilon_0) \cap \left[\phi^{(t-t_0)} \psi^{-1} \phi^{(t_0-T)}\right](\epsilon_T)\right) / \mu(\epsilon_B) .$$

As before, the mixing property, for $T - t > \tau$, yields $\rho^A_\alpha(t) = \mu\left(\Delta_\alpha \cap \phi^{(t)}(\epsilon_0)\right) / \mu(\epsilon_0)$,
which is the initial-value-only macroscopic time evolution. For $\rho^B_\alpha$, the only
difference is to add a step, $\psi^{-1}$. Unless $\psi^{-1}$ is diabolically contrived to undo $\phi^{(-u)}$ for
large $u$, this will not affect the argument that showed that the dependence on $\epsilon_T$
disappears. Thus A and B have the same macrostates before $t_0$.

For $t > t_0$, $\rho^A_\alpha(t)$ continues its behavior as before. For $\rho^B_\alpha(t)$ things are different:

$$\rho^B_\alpha(t) = \mu\left(\Delta_\alpha \cap \left[\phi^{(t-t_0)} \psi\, \phi^{(t_0)}\right](\epsilon_0) \cap \phi^{(t-T)}(\epsilon_T)\right) / \mu(\epsilon_B) \quad (t > t_0) .$$

Now I require $T - t > \tau$. If this is satisfied the $\epsilon_T$ dependence drops out and

$$\rho^B_\alpha(t) = \mu\left(\Delta_\alpha \cap \left[\phi^{(t-t_0)} \psi\, \phi^{(t_0)}\right](\epsilon_0)\right) / \mu(\epsilon_0) .$$

This shows that the effect of $\psi$ is the usual initial-conditions-only phenomenon.

If we repeat these arguments for $t_0$ such that $T - t_0$ is small, then just as we
showed in Sec. 3, the effect of $\psi$ will only be at times $t$ less than $t_0$.

This manifestation of causality has a clear intuitive origin. As the perturbation
time, $t_0$, choose a value small enough that the system has not equilibrated [14]. All
points, $a \in \epsilon_A$ and $b \in \epsilon_B$, start in $\epsilon_0$ and end in $\epsilon_T$. Working backward from $t_0$,
what can $b$ do? It must get to $\epsilon_0$. Since $t_0$ is less than the relaxation time, the
places it can be are essentially the same places that $a$ can be. However, after the
perturbation the need to arrive in $\epsilon_T$ places no macroscopic restriction on $b$, because
from any coarse grain in $\Omega$ you can find your way into $\epsilon_T$. This is precisely because
working backward from $T$, the set $\epsilon_T$ spreads throughout $\Omega$ in time $T - t_0$ (and
in particular $\phi^{(t_0-T)}\epsilon_T$ enters the coarse grain into which $\psi$ would send $b$ if there
were no future conditioning). Thus, satisfying the changed boundary conditions is
accomplished by keeping $a$ and $b$ close to one another before $t_0$, and allowing the
perturbation a free hand in moving $b$ away from $a$, after $t_0$.

Integrable systems (causality) 

As before, without mixing or some kind of ergodicity our arguments fail.
Nevertheless, just as frequency smearing gave relaxation, however imperfect, it can give
causality. Again an extended time scale is needed, but the intuitive reasoning just
given continues to hold in the integrable case as well.

Consider a particular example, an oscillator of the sort discussed in Sec. 3. Take
$\delta x$ so small that the condition in Eq. (6) forces all the points to have essentially
the time dependence, $x = \nu t$, with $\nu T = n$. The frequency $\nu$ therefore
satisfies $\nu = n/T$, with $n$ selected so that $\nu_0 < \nu < \nu_0 + \delta\nu$; for large $T$, this allows
an extensive range of $n$ values. Now consider the following perturbation: at the
moment $t_0$, $x$ is displaced by a macroscopic angle $\gamma$, i.e., $\gamma > 1/G$. Solving the
same boundary value problem gives $x = \nu t$ before $t_0$, and $x = \nu t + \gamma$ after $t_0$. With
the perturbation, $\nu$ must satisfy $\nu = n/T - \gamma/T$, again yielding an extensive range
of $n$ values for large $T$. The difference between the two ranges of $n$ values is $\gamma/T$,
which for large enough $T$ will be a small fraction of all $n$ values that are common
to the perturbed and unperturbed motion. The $n$ values that are not common to both
are henceforth dropped from consideration.

For $n$ values that are common to the two solution sets, the difference between
solutions with the same $n$ arises from the $\gamma$-dependent difference in $\nu$:

$$x_{\rm perturbed}(t) - x_{\rm unperturbed}(t) = \begin{cases} -\gamma t/T , & t < t_0 \\ \gamma (1 - t/T) , & t > t_0 \end{cases} \tag{10}$$

which is independent of $n$. It follows that there is a difference between perturbed
and unperturbed motion that is of order $\gamma$ through most of the time period $[0, T]$.
The effect of the perturbation is felt both before and after. It thus appears that
there is no causality, but closer consideration shows this conclusion to be wrong.

Figure 1. Entropy, $S$, as a function of time for a mixing system, with two-time
conditioning. For the left figure (1a) there is no perturbation. In the middle (1b)
there is a perturbation at time 3. On the right (1c) the perturbation is nominally
at time 14, although because of the way entropy is calculated (after a time step, in
terms of the nonthermodynamic parameter $t$) it is effectively at time 13½. (In each
panel the horizontal axis is $t$.)

Recall that it is only meaningful to consider perturbations that take place
before the system has relaxed (or close enough to $T$ that the reverse process has
commenced). Thus the perturbation should occur for $t_0 < \tau_{\rm osc} = 1/\delta\nu$. On the other
hand, for full relaxation the value of $T$ should be greater than $G/\delta\nu$, as discussed in
Sec. 3. From Eq. (10) the maximum value of the precursor (the noncausal term) is
$\gamma t_0/T$, just before the perturbation. These considerations are combined to yield

$$\frac{1}{\gamma} \cdot (\text{Maximum noncausal precursor}) = \frac{t_0}{T} < \frac{\tau_{\rm osc}}{G/\delta\nu} = \frac{1}{G} .$$

But the size of a coarse grain is $1/G$, so that this precursor is in fact microscopic.
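
The size of the precursor can be checked with explicit numbers. The following is a minimal sketch, assuming the unperturbed solution $x = (n/T)\,t$ and the perturbed solution $x = ((n - \gamma)/T)\,t$, with $\gamma$ added after $t_0$, as described in the text; the values $G = 25$, $\delta\nu = 0.1$ and $t_0 = 4$ are taken from Sec. 5, while $T = 2G/\delta\nu$ and $\gamma = 0.2$ are hypothetical choices satisfying $T > G/\delta\nu$ and $\gamma > 1/G$.

```python
def displacement_difference(t, t0, T, gamma):
    """Perturbed minus unperturbed position for a common n (before taking mod 1).

    Unperturbed: x = (n/T) t.
    Perturbed:   x = ((n - gamma)/T) t, plus gamma once t > t0.
    """
    if t < t0:
        return -gamma * t / T          # noncausal precursor
    return gamma * (1.0 - t / T)       # ordinary, causal response

G, dnu = 25, 0.1        # grain count and frequency width used in Sec. 5
T = 2 * G / dnu         # hypothetical conditioning interval with T > G/dnu
t0 = 4                  # perturbation time, below tau_osc = 1/dnu = 10
gamma = 0.2             # hypothetical macroscopic kick (gamma > 1/G)

# Largest precursor occurs just before the perturbation.
max_precursor = abs(displacement_difference(t0 - 1e-9, t0, T, gamma))
grain_size = 1.0 / G
```

Here the precursor is about 0.0016, far below the grain size 0.04, while the response just after $t_0$ is close to the full kick $\gamma$.
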

Double arrow systems 

In [5] I showed that causality obtains in opposite directions in systems con- 
taining opposite arrows. The general principle is the same as that presented here 
although a detailed presentation would be more complicated by virtue of the si- 
multaneous presence of two directions for causality. I will not provide an analytic 
demonstration and only mention this matter here for completeness. 



5. Numerical illustrations 

Although the purpose of the present article is to go beyond the numerical 
simulations of previous publications, I will illustrate the phenomena studied here. 

In Fig. 1 I show the effect of using two-time boundary values on the dynamics
of the cat map, $\phi_C$. (This is a map of the unit square into itself with the rule:
$x' = x + y$, $y' = x + 2y$, mod 1. It is a mixing transformation, intensively exploited
in ergodic theory [7], and I have used it as an example for two-time boundary value
problems in many places, e.g., [4].)

On the left (1a) there is no perturbation. The boundary conditions are that
the system must be in a particular coarse grain (of size 0.1 × 0.1) at times 0 and 16.



Figure 2. Entropy, S, as a function of time, for oscillators with two-time condition- 
ing. The left figure (2a) shows an entire run of 200 time steps. Both perturbed and 
unperturbed motion appear. They differ in many places. Note also the half-time 
depression in entropy, as well as other, smaller reductions. In the middle (2b) is 
shown only the first few time steps. The perturbation is at time-4. Causality is 
evident. On the right (2c) only 50 time steps are used. Although from the previous 
figures it is clear that the entropy can reach its maximum values within about 10 
time steps, when the conditioning time is 50 (as in 2c) the system cannot get near 
the equilibrium value. (All figures show the same numerical range of entropy.) 



Evidently the entropy is a more or less symmetric function of time. (The statistical
error comes from using a sample of 500 points, rather than the full set $\epsilon$.)

In terms of the direction of entropy increase it is clear that this arrow is a
consequence of the boundary values given.

In the middle figure (1b), a perturbation is applied at time-3. At that time,
instead of $\phi_C$, a different map is applied ($x' = 2x + 3y$, $y' = x + 2y$, mod 1; this
is more chaotic, and equilibrates faster than $\phi_C$). The computer simulation calculates
the entropy after time 3, "after" in the sense of the nonthermodynamic
parameter $t$, but reports this as the time-3 entropy. Because of this convention one
can think of the perturbation as taking place at time-2½. The deviation between
the perturbed and unperturbed entropy is for times 3 and 4 (by time-5 both systems
are in equilibrium). Because the perturbed system received a bigger kick at time-3
its entropy increases more rapidly.

The point of this figure is that the difference between the curves is confined to 
times later than the perturbation. The system shows causality. 
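
A simulation of this kind can be sketched in a few lines. The following is a minimal sketch, not the program used for Fig. 1: it assumes a 10 × 10 grid of coarse grains, boundary sets equal to the grain $[0, 0.1) \times [0, 0.1)$ at times 0 and 16, rejection sampling of 500 histories, and the convention that the step producing time 3 uses the replacement map. The function names and seed are illustrative.

```python
import math
import random

def cat_map(x, y):
    return (x + y) % 1.0, (x + 2.0 * y) % 1.0

def kick_map(x, y):
    """Replacement map for the single perturbed step: x' = 2x + 3y, y' = x + 2y (mod 1)."""
    return (2.0 * x + 3.0 * y) % 1.0, (x + 2.0 * y) % 1.0

def entropy(points, k=10):
    """Coarse-grained entropy on a k x k grid of equal grains."""
    counts = {}
    for x, y in points:
        g = min(int(x * k), k - 1) * k + min(int(y * k), k - 1)
        counts[g] = counts.get(g, 0) + 1
    n, G = len(points), k * k
    return -sum((c / n) * math.log((c / n) * G) for c in counts.values())

def entropy_history(kick_step=None, T=16, n_keep=500, side=0.1, seed=5):
    """Entropy at t = 0..T for histories that start and end in [0, side)^2.

    If kick_step is given, the step that produces time kick_step uses kick_map.
    """
    rng = random.Random(seed)
    kept = []
    while len(kept) < n_keep:
        path = [(rng.random() * side, rng.random() * side)]
        for step in range(1, T + 1):
            f = kick_map if step == kick_step else cat_map
            path.append(f(*path[-1]))
        if path[-1][0] < side and path[-1][1] < side:   # final boundary condition
            kept.append(path)
    return [entropy([p[t] for p in kept]) for t in range(T + 1)]

S_A = entropy_history()               # unperturbed system A
S_B = entropy_history(kick_step=3)    # system B, kicked on the step into time 3
early_gap = max(abs(a - b) for a, b in zip(S_A[:3], S_B[:3]))
```

Comparing `S_A` and `S_B` entry by entry gives the two curves of the middle panel; `early_gap` measures their disagreement at times 0 to 2, which should be no larger than the statistical scatter of a 500-point sample.
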

The right hand figure (1c) shows a system that is perturbed at time-14. As
before, the entropy-calculating convention makes this effectively a perturbation at
time 13½. In this case, the difference between perturbed and unperturbed systems is
before the perturbation, "before" in the sense of the nonthermodynamic parameter 
t. However, in terms of the direction of entropy increase, the entropy increase arrow 
and the causality arrow agree. This could be called reverse causality, but it is just 
normal causality with a bad choice of nonthermodynamic time parameter. 

Finally I show what happens for harmonic oscillators. The perturbation is
slightly different from that studied analytically above. Rather than a displacement,
the system advances by $3\nu$ instead of $\nu$. The results are essentially the same and
by this small change one also can see a level of robustness of the phenomenon.

For Fig. 2 there are 25 coarse grains along the "x" direction and the frequency 
interval is of width 1/10. Thus unconditioned relaxation should take place in about 
10 time steps, but full equilibration should take about 250. For Fig. 2a and 2b 
(which are from the same run) both aspects are evident. With 200 time steps the 
system does approach S = 0 and the relaxation time is about 10 (as is seen more
easily in Fig. 2b). On the other hand, in Fig. 2c, with conditioning for 50 time steps, 
S is far from 0, so that the potential for 10-time-step relaxation is thwarted by the 
future condition. In all cases there is a perturbation at time-4. Fig. 2b clearly 
shows that there is causality in this case. On the other hand, for the third figure, 
the system's relaxation is so compromised that the action of the perturbation takes 
place when the system has reached its maximum, although reduced, entropy level. 

Appendix A. Entropy increase, stochastic dynamics and coarse graining

The formalism developed above is useful for a general derivation of entropy 
nondecrease. The derivation also holds for quantum mechanics. 
Proposition: Coarse graining a distribution function, evolving it forward, and then 
again coarse graining, either increases the entropy or leaves it unchanged. 

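Table 1. Classical-quantum correspondence for the entropy increase proposition.

    Classical           Quantum
    $\Omega$            $\mathcal{H}$ (Hilbert space)
    $\rho$              $\rho$ (density matrix)
    $\Delta_\alpha$     $\mathcal{H}_\alpha$ (subspace)
    $\mu$               Trace
    $\chi_\alpha$       $P_\alpha$ (projector)
    $v_\alpha$          dimension of $\mathcal{H}_\alpha$
    $\phi^{(t)}$        $U_t$

Let the distribution function for a classical system at time-0 be $\rho$. It is coarse
grained to yield $\hat\rho$, which is taken as $\rho(0)$. Thus

$$\rho(0) = \hat\rho = \sum_\alpha \frac{\chi_\alpha}{v_\alpha}\, \rho_\alpha .$$

From Sec. 2, the entropy of this distribution is $S(\rho_\alpha | v_\alpha) = -\sum_\alpha \rho_\alpha \log(\rho_\alpha / v_\alpha)$.
(Recall that $v_\alpha = \mu(\Delta_\alpha)$, the volume of coarse grain $\alpha$.) At time-$t$, $\rho$ becomes

$$\rho(t) = \sum_\alpha \frac{\tilde\chi_\alpha}{v_\alpha}\, \rho_\alpha ,$$

with $\tilde\chi_\alpha$ the characteristic function of $\tilde\Delta_\alpha \equiv \phi^{(t)}(\Delta_\alpha)$. Now coarse grain again. This is
the step where entropy nondecrease is forced, and I discuss its physical significance
below. Coarse graining the function $\tilde\chi_\alpha$, the distribution function becomes

$$\hat\rho(t) = \sum_\alpha \frac{\rho_\alpha}{v_\alpha} \sum_\beta \frac{\chi_\beta}{v_\beta}\, \mu\left(\Delta_\beta \cap \phi^{(t)}(\Delta_\alpha)\right) = \sum_\beta \frac{\chi_\beta}{v_\beta}\, \rho'_\beta ,$$

with

$$\rho'_\beta = \sum_\alpha R(\beta, \alpha)\, \rho_\alpha \quad \text{and} \quad R(\beta, \alpha) = \mu\left(\Delta_\beta \cap \phi^{(t)}(\Delta_\alpha)\right) / v_\alpha .$$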

It follows that the entropy of the distribution $\rho'$ is $S((R\rho)_\beta | v_\beta)$. Thus to establish
the proposition above I must show that $S(R\rho | v) \geq S(\rho | v)$.

First I show that the matrix $R$ is stochastic, i.e., its elements are nonnegative
and each column sums to one. The sum is $\sum_\beta R(\beta, \alpha) = \mu\left[(\cup_\beta \Delta_\beta) \cap \phi^{(t)}(\Delta_\alpha)\right] / v_\alpha$.
Since $\phi^{(t)}$ is measure preserving this sum gives unity. Furthermore, $Rv = v$, with $v$
the vector of grain volumes.

It is a theorem [15,16] that for any pair of distributions, $p$ and $q$, for which
$S(p|q)$ is defined, and for any stochastic matrix $M$, $0 \geq S(Mp|Mq) \geq S(p|q)$.
Apply this to $\rho$ and $\rho' = R\rho$, taking $M = R$ and $q = v$. Making use of $Rv = v$, the
proposition stated above on entropy nondecrease is established for classical dynamics.
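
The inequality can be illustrated numerically in a special case. The following is a minimal sketch, assuming equal grain volumes $v_\alpha = 1/G$, so that any convex mixture of permutation matrices is stochastic and satisfies $Rv = v$; this construction is a convenient stand-in for the matrix $R$ of the text, not the matrix generated by any particular dynamics. The function names, $G = 8$ and the seed are illustrative.

```python
import math
import random

def rel_entropy(p, q):
    """S(p|q) = -sum_x p(x) log(p(x)/q(x)), skipping terms with p(x) = 0."""
    return -sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0.0)

def mat_vec(M, p):
    """Matrix-vector product (M p)_b = sum_a M[b][a] p[a]."""
    return [sum(M[b][a] * p[a] for a in range(len(p))) for b in range(len(M))]

def random_stochastic_fixing_uniform(G, n_perms, rng):
    """Convex mixture of permutation matrices: columns sum to 1 and R v = v for uniform v."""
    R = [[0.0] * G for _ in range(G)]
    weights = [rng.random() for _ in range(n_perms)]
    total = sum(weights)
    for w in weights:
        perm = list(range(G))
        rng.shuffle(perm)
        for a in range(G):
            R[perm[a]][a] += w / total    # column a sends weight to row perm[a]
    return R

rng = random.Random(6)
G = 8
v = [1.0 / G] * G                          # equal grain volumes (assumption)
raw = [rng.random() for _ in range(G)]
rho = [r / sum(raw) for r in raw]          # arbitrary grain probabilities
R = random_stochastic_fixing_uniform(G, n_perms=4, rng=rng)
rho_new = mat_vec(R, rho)
s_before, s_after = rel_entropy(rho, v), rel_entropy(rho_new, v)
```

In this example `s_after` is never smaller than `s_before`, and both are bounded above by 0, in line with $0 \geq S(R\rho|v) \geq S(\rho|v)$.
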

The physical content of this derivation was incorporated in the replacement 
of p(t) by its coarse grained smearing. The assumption is that within each grain 
the phase space points have spread uniformly. Thus for physical application of 
this proposition t cannot be arbitrarily small. It must exceed a microscopic relax- 
ation time associated with the coarse grains. Moreover, in coarse graining there is a 
destruction of information — monotonic entropy behavior contradicts the entropy re- 
versal one gets using low-entropy two-time boundary conditions, as well as the more 
traditional counterexamples arising from Poincare recurrence and time reversal. 

The quantum version of this proposition involves no new mathematics, only a
correspondence between the classical and quantum quantities. See Table 1. I find

$$\hat\rho = \sum_\alpha \frac{P_\alpha}{v_\alpha}\, \rho_\alpha , \quad \text{with } \rho_\alpha = {\rm Tr}\, P_\alpha \rho \text{ and } v_\alpha = {\rm Tr}\, P_\alpha .$$

Entropy is again $S(\rho_\alpha | v_\alpha)$. (The $v$s no longer sum to unity, but this makes no
essential difference.) Time evolution is given by a unitary operator, $U_t$, acting
in the usual way: $\rho(t) = U_t \rho(0) U_t^\dagger$. Carrying through the same steps as for the
classical case, coarse graining, evolving in time and coarse graining again, leads to
the same equations, but with the matrix "$R$" now given by

$$R(\beta, \alpha) = {\rm Tr}\left(P_\beta U_t P_\alpha U_t^\dagger\right) / v_\alpha .$$

Stochasticity of $R$ is readily established and entropy nondecrease follows as above.

Appendix B. Perturbation in a deterministic system 

A perturbation is often thought of as an act of control. In contrast, it would 
seem that imposing future conditions denies the possibility of modified evolution. 
Put differently, perturbing is an act of free will; future conditions — along with the 
deterministic context for their imposition — fly in the face of that concept. 

This is not the place for a discussion of free will, except to mention that contrary 
to the impression of many physicists, some philosophers find justification for free 
will, not from the supposed indeterminism of quantum mechanics, but from chaos 
in deterministic dynamical systems [17, p. 152]. 

But one need not imagine an independent actor to obtain the "perturbation" 
of Sec. 4. Consider the following situation, within the context of the cosmological 
scenario described in [4] or [6]. Two systems, A and B, are small parts of a big 
universe, but they are isolated, or nearly so, between the times to be used for the 
boundary value problem. The actual macroscopic boundary values for the two of 
them are the same. Now imagine that one of them, say B, is not perfectly isolated, 
but at some intermediate time, $t_0$, in its history, is struck by something coming
in from the outside. This "outside" is simply another part of the universe, not A 
and not B. Its main properties are its lack of correlation with what is otherwise 
happening to A and B, and its ability to pack a macroscopic wallop in B. Despite 
the outside force, I still require the same boundary values for A and B. 

Now compare the macroscopic motions of A and of B. Were it not for the 
outside force, they should be the same. With the force, having changes occur only 
on one (temporal) side of the perturbation is what I call macroscopic causality. 